A direct method for computing extreme value (Gumbel) parameters for gapped biological sequence alignments

Authors

  • Terrance Quinn
  • Zachariah Sinkala
Abstract

We develop a general method for computing extreme value distribution (Gumbel, 1958) parameters for gapped alignments. Our approach uses mixture distribution theory to obtain associated BLOSUM matrices for gapped alignments, which in turn are used for determining significance of gapped alignment scores for pairs of biological sequences. We compare our results with parameters already obtained in the literature.
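As a rough illustration of how extreme value parameters are used to judge the significance of an alignment score, the sketch below evaluates the standard Gumbel-type tail probability P(S ≥ x) = 1 − exp(−K·m·n·e^(−λx)). It is not the authors' method for obtaining the parameters. The function name `gumbel_pvalue` and the numeric values of `lam`, `K`, `m`, and `n` are assumptions chosen only for illustration and do not come from the paper.

```python
import math


def gumbel_pvalue(score, lam, K, m, n):
    """Tail probability P(S >= score) under a Gumbel-type model.

    lam and K are the extreme value parameters; m and n are the
    lengths of the two sequences being compared. All inputs here
    are hypothetical and supplied by the caller.
    """
    # Expected number of alignments scoring at least `score` (the E-value).
    expected = K * m * n * math.exp(-lam * score)
    # 1 - exp(-E), computed with expm1 to stay accurate when E is tiny.
    return -math.expm1(-expected)


# Placeholder parameter values, for illustration only.
p = gumbel_pvalue(score=60, lam=0.267, K=0.041, m=250, n=250)
print(f"P(S >= 60) is approximately {p:.3e}")
```

When the expected count is small, the P-value is nearly equal to it, which is why the two are often used interchangeably for high-scoring alignments.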


Similar resources

Score distributions of gapped multiple sequence alignments down to the low-probability tail.

Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution ...


The Statistics of Semi-Probabilistic Alignment

Computer-assisted sequence comparison has become an integral part of modern molecular biology. Two types of algorithms have been used: those which search for the optimal alignment (as exemplified by the Smith-Waterman algorithm [1]), and those which identify likely alignments (as exemplified by the HMM-based “Sequence Alignment Modules” [2]). In each case, the quality of alignment is summarized...


Reproductive toxicology. Trichloroethylene.

Background: The optimal score for ungapped local alignments of infinitely long random sequences is known to follow a Gumbel extreme value distribution. Less is known about the important case where gaps are allowed. For this case, the distribution is only known empirically in the high-probability region, which is biologically less relevant. Results: We provide a method to obtain numerically the ...


On Moments of the Concomitants of Classic Record Values and Nonparametric Upper Bounds for the Mean under the Farlie-Gumbel-Morgenstern Model

In a sequence of random variables, record values are observations that exceed or fall below the current extreme value. Now consider a sequence of pairs of random variables {(X_i, Y_i), i ≥ 1}. When the experimenter is interested in studying just the sequence of records of the first component, the second component associated with a record value of the first one is termed the concomitant of that ...


Rapid Assessment of Extremal Statistics for Gapped Local Alignment

The statistical significance of gapped local alignments is characterized by analyzing the extremal statistics of the scores obtained from the alignment of random amino acid sequences. By identifying a complete set of linked clusters, "islands," we devise a method which accurately predicts the extremal score statistics by using only one to a few pairwise alignments. The success of our method rel...



Journal title:
  • International journal of bioinformatics research and applications

Volume 10, Issue 2

Pages  -

Publication date: 2014